
Class detection works for huggingface checkpoints #1800

Merged — 2 commits merged into keras-team:master from from-preset-improvements on Aug 28, 2024

Conversation

@mattdangerw (Member) commented Aug 27, 2024

Fixes #1798. You can now do `keras_nlp.models.Backbone.from_preset("hf://google-bert/bert-base-uncased")` with a safetensors checkpoint, and we will find the correct architecture to instantiate. This has always worked for Keras-style checkpoints, but not safetensors ones.
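A minimal usage sketch of the newly working path (assuming a recent `keras_nlp` install; the handle is the one from the example above):

```python
import keras_nlp

# With this change, architecture detection also works for safetensors
# checkpoints: the loader inspects the Hugging Face config, resolves the
# matching KerasNLP backbone class, and converts the weights.
backbone = keras_nlp.models.Backbone.from_preset(
    "hf://google-bert/bert-base-uncased"
)
print(type(backbone).__name__)  # Expect the BERT backbone class.
```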

This was a tricky one to fix, and it involved some large refactoring of our preset loading routines.

Originally the intent was that `from_preset()` would be an easily readable bunch of lower-level Keras calls. With the arrival of transformers conversions, and soon timm conversions, I think that goal is no longer realistic. Instead I added a loader interface, with default implementations of `load_task` and `load_preprocessor` (see the sketch after the list below). Every format we support directly converting from has to support, at a minimum:

  • Detecting the backbone class.
  • Loading the backbone class.
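
As a rough illustration, the loader interface might look something like the following. Only `load_task` and `load_preprocessor` are named in this PR; the other method names (`check_backbone_class`, `load_backbone`) and all signatures are hypothetical.

```python
# Hypothetical sketch of the loader interface, not the merged code.
class PresetLoader:
    """Loads models from one checkpoint format (Keras, transformers, ...)."""

    def __init__(self, preset, config):
        self.preset = preset
        self.config = config

    def check_backbone_class(self):
        """Detect which backbone class this checkpoint maps to."""
        raise NotImplementedError

    def load_backbone(self, cls, load_weights, **kwargs):
        """Instantiate the backbone, converting weights if needed."""
        raise NotImplementedError

    def load_task(self, cls, load_weights, **kwargs):
        # Default implementation: build the task around a loaded backbone.
        backbone = self.load_backbone(cls.backbone_cls, load_weights)
        return cls(backbone=backbone, **kwargs)

    def load_preprocessor(self, cls, **kwargs):
        # Default implementation: build the preprocessor (e.g. tokenizer
        # plus packing) from the preset's assets.
        return cls(**kwargs)
```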

One consequence of this work is that every class with a `from_preset` constructor needs to reference the `backbone_cls` it matches with. I think this will be a more stable way to handle our "auto class"-like functionality as we venture further towards multi-modal models.
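
Concretely, the pairing could look like this. The stub class names and the lookup helper are illustrative only; the mechanism is the `backbone_cls` attribute described above.

```python
# Illustrative stubs showing the pairing mechanism.
class BertBackbone:
    pass

class Tokenizer:
    backbone_cls = None  # Concrete subclasses override this.

class BertTokenizer(Tokenizer):
    # The loader routes a detected backbone class to the matching
    # tokenizer/preprocessor/task via this attribute.
    backbone_cls = BertBackbone

def find_tokenizer_cls(backbone_cls):
    # "Auto class" style lookup: match a detected backbone class to the
    # tokenizer subclass that declares it.
    for cls in Tokenizer.__subclasses__():
        if cls.backbone_cls is backbone_cls:
            return cls
    return None

assert find_tokenizer_cls(BertBackbone) is BertTokenizer
```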

@mattdangerw force-pushed the from-preset-improvements branch 3 times, most recently from 893fad0 to c2fe6e8 on August 27, 2024 02:35
@SamanehSaadat (Member) left a comment
LGTM! Thanks, Matt! I think the new design looks very nice!

(Review thread on keras_nlp/src/utils/preset_utils.py — outdated, resolved)
@mattdangerw merged commit 0c04abe into keras-team:master on Aug 28, 2024
8 of 10 checks passed
Successfully merging this pull request may close these issues:

- from_preset issues for huggingface/transformers checkpoint converters (#1798)